As part of the data visualization subdivision of the Ethereum Foundation's community staking grantees, our research seeks to provide graphical insights into the function and health of the network. Over the last few months, we have worked to update our analysis of validator node performance, which was originally performed on the Medalla Testnet. In this article, we provide an update on that progress, highlighting the work we've done on both the data infrastructure and the analysis. Specifically, the three major updates are as follows:
We will cover each of these pieces in this post.
This section covers the steps we took to configure a database backend for the Validator analysis.
The first major update was to create a robust and scalable data backend. For the Medalla Testnet, we collected the data ad hoc, scraping information from Beaconscan. For this update, we've built upon chaind to pull data directly from the Ethereum blockchain into a structured PostgreSQL database.
For the technically curious, these are the exact steps we took to configure the above setup on our Ubuntu 20.04 analytics server.
In Figure 2.1 our Teku Beacon node is catching up to the head slot. When it is complete, the data is available on the server for chaind to begin synchronization to the database.
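The node's progress toward the head slot can also be polled programmatically. As a small sketch (assuming Teku is serving the standard Beacon API on its default REST port, 5051, which must be enabled in the node's configuration):

```python
import json
from urllib.request import urlopen

# Assumed endpoint: Teku's REST API defaults to port 5051 on localhost.
BEACON_API = "http://localhost:5051"

def parse_sync_status(payload):
    """Extract (head_slot, sync_distance, is_syncing) from a
    /eth/v1/node/syncing response body."""
    data = payload["data"]
    return int(data["head_slot"]), int(data["sync_distance"]), data["is_syncing"]

def sync_status(base_url=BEACON_API):
    """Query the node for its current sync status."""
    with urlopen(f"{base_url}/eth/v1/node/syncing") as resp:
        return parse_sync_status(json.load(resp))
```

Once `sync_distance` reaches zero, the node is at the head and chaind can keep the database current in near real time.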
Figure 2.1: Left: A Teku Beacon node synchronizing from the Ethereum blockchain to our server. Right: chaind synchronizing data from the Beacon node into the PostgreSQL database.
The tables in this database now contain the data needed to recreate our validator analysis. With some minor manipulation and joins of the raw data, we obtain a dataset that matches the original structure of the data collected from the Medalla Testnet, which can be seen below:
| publickey | index | currentBalance | effectiveBalance | assigned | executed | skipped | eligibilityEpoch | activationEpoch | exitEpoch | withdrawableEpoch | slashed |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0x8e968b….77adc40b | 55738 | 288.1698 | 32 | 3 | 3 | 0 | 6265 | 9144 | NA | NA | FALSE |
| 0xb228bd….f9be419b | 53633 | 191.7624 | 32 | 2 | 0 | 2 | 5051 | 8618 | NA | NA | FALSE |
| 0xaf7cc1….18ee94ff | 44766 | 191.6700 | 32 | 4 | 0 | 4 | 4155 | 6401 | NA | NA | FALSE |
| 0x91845a….048a358b | 25550 | 190.1818 | 32 | 10 | 3 | 7 | 311 | 1404 | NA | NA | FALSE |
| 0x81ccb4….e5130868 | 34231 | 160.3910 | 32 | 9 | 7 | 2 | 3312 | 3768 | NA | NA | FALSE |
| 0xb8cd03….90adeb16 | 52757 | 159.7610 | 32 | 5 | 0 | 5 | 4984 | 8399 | NA | NA | FALSE |
| 0x8a6120….1bb56a62 | 23018 | 158.8510 | 32 | 16 | 12 | 4 | 119 | 771 | NA | NA | FALSE |
| 0xadeac9….8470b19c | 14618 | 158.1426 | 32 | 17 | 3 | 14 | 0 | 0 | NA | NA | FALSE |
| 0x89cec3….82f251aa | 21754 | 158.0481 | 32 | 14 | 0 | 14 | 36 | 455 | NA | NA | FALSE |
| 0x97bdad….f7972c3e | 42259 | 128.3422 | 32 | 6 | 6 | 0 | 3993 | 5775 | NA | NA | FALSE |
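The manipulation and joins in question can be sketched with pandas. The frames and column names below are illustrative placeholders rather than chaind's exact schema; the idea is to aggregate per-validator duty outcomes and merge them onto the validator registry:

```python
import pandas as pd

# Illustrative data standing in for rows pulled from the chaind tables.
validators = pd.DataFrame({
    "index": [0, 1],
    "publickey": ["0x8e96...", "0xb228..."],
    "effectiveBalance": [32, 32],
})
duties = pd.DataFrame({
    "index": [0, 0, 1, 1, 1],               # validator each duty was assigned to
    "executed": [True, False, True, True, False],
})

# Aggregate duty outcomes per validator: total assigned, executed, and skipped.
per_validator = duties.groupby("index").agg(
    assigned=("executed", "size"),
    executed=("executed", "sum"),
).reset_index()
per_validator["skipped"] = per_validator["assigned"] - per_validator["executed"]

# Join the aggregates back onto the validator registry.
dataset = validators.merge(per_validator, on="index", how="left")
```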
In this section, we will survey the results from our new analysis of validator performance, comparing and contrasting the old (Medalla) results to the new results. At a high level, our findings indicate that validator performance has generally increased across the board.
The Medalla Testnet data spanned 15,450 epochs and included a total of 80,392 validators overall, beginning with the genesis block on August 4th, 2020. By contrast, as of this writing (April 21st, 2021) the Ethereum Mainnet data includes 31,592 epochs with slots assigned to 121,335 validators, beginning from the genesis block on December 1st, 2020.
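Because epochs elapse at a fixed rate of 32 slots of 12 seconds each, these epoch counts can be sanity-checked from the genesis timestamps alone. A quick sketch, using mainnet's genesis time of December 1st, 2020, 12:00:23 UTC:

```python
from datetime import datetime, timezone

SECONDS_PER_EPOCH = 32 * 12  # 32 slots per epoch, 12 seconds per slot

def epochs_since_genesis(genesis, now):
    """Number of complete epochs elapsed between genesis and `now`."""
    return int((now - genesis).total_seconds() // SECONDS_PER_EPOCH)

mainnet_genesis = datetime(2020, 12, 1, 12, 0, 23, tzinfo=timezone.utc)
as_of = datetime(2021, 4, 21, tzinfo=timezone.utc)
print(epochs_since_genesis(mainnet_genesis, as_of))  # roughly 31,600
```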
Some epochs within the Medalla test phase activated more than 4 validators; this has not happened on the current Beacon chain.
During the last 3,000 or so epochs, a number of epochs have had fewer than the standard 4 validators activated within them.
Constant validator inflows and outflows on the Medalla Testnet left a significant number of nodes without any assignments. One of the first hints that mainnet validators have been performing well is the larger average number of assignments. The peak of the distribution for the mainnet on the right shows that many validators have been assigned at least five attestations. As we continue to track this distribution, it is likely to begin to skew leftward as the early cohorts successfully validate blocks.
When we look more deeply at the breakdown of assignments, executions, and skips across the blocks, we see a similar pattern: the steady performance of the validators, and their time on the network, have increased the average number of successful proposals per validator. The number of skipped slots has also decreased dramatically, both in the shape of the distribution and in the average, when comparing the two networks.
Given what we’ve seen above, it is no surprise that the execution rate, as measured by the number of executed blocks over assigned blocks, is much closer to 100% for all active validators on the Ethereum 2.0 mainnet.
Likewise, the skip rate, the number of blocks skipped divided by the number assigned, has plummeted, which suggests that validators are completing their attestation duties completely and correctly.
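For concreteness, these two rates can be expressed as tiny helpers (a sketch; the field names follow the dataset shown earlier):

```python
def execution_rate(executed, assigned):
    """Executed blocks over assigned blocks; 0 when nothing was assigned."""
    return executed / assigned if assigned else 0.0

def skip_rate(skipped, assigned):
    """Skipped blocks over assigned blocks; 0 when nothing was assigned."""
    return skipped / assigned if assigned else 0.0

# Example: a validator assigned 16 slots that executed 12 and skipped 4.
print(execution_rate(12, 16))  # 0.75
print(skip_rate(4, 16))        # 0.25
```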
The time-to-exit distribution is still right-skewed, with most exiting validators leaving quite early. Since the launch of mainnet, most misconfigured nodes have left within the first 200 hours. A secondary cluster peaks at around 1,500 hours, which motivated the following analysis of the time series of validator exits.
When looking at the exiting validators by epoch compared to the Medalla Testnet, we can immediately see that validators have taken their job more seriously. Only 144 validators have exited in the more than 31,000 epochs tracked since the Beacon chain's inception. There was a noticeable spike between epochs 14,000 and 15,000.
Like the Medalla chain, the Beacon chain experiences spikes in exits over relatively short periods of time. Here, between epochs 14,000 and 15,000, the cumulative count went from slightly over 40 exited validators to slightly more than 140. On a positive note, there are much longer periods with either no exits or very few per epoch.
Much like the exits over time, slashings appear to occur in bulk; the majority occurred between February 1st and 6th (roughly epochs 14,000 to 15,000). As it turns out, this was due to a double-signing mishap by a single staking provider.
All in all, the theme of validators generally performing better than was seen on the Medalla Testnet runs through most of this analysis. We now turn our attention to deriving a new set of tiers based on the new data.
Using the mainnet data, we performed the same feature derivation and clustering routine in an attempt to derive a new tier distribution for the validators. However, when we applied the previous procedure directly to the mainnet data, our scoring system proved inadequate. The distribution of scores obtained from the old procedure shows an immediate lack of distinctiveness between tiers of validators: the perfect validators extend linearly along a scale, with the imperfect validators below them. In other words, the score distribution does not differentiate validators enough to map neatly onto tiers.
Figure 4.1: Sorted validator scores when applying old scoring procedure to Mainnet data (right) compared with the previous Medalla results (left).
The breakdown has several causes. The Medalla test data lent itself more naturally to a tier-based structure: the score thresholds were more distinct, and the behavior within tiers was more consistent. With the mainnet data, largely due to the across-the-board increase in validator performance, our old thresholds failed to perform well - the vast majority of validators would have achieved tier 1. Furthermore, the score distribution was far from normal, with several skewed regions along the curve.
We then took an alternative approach. We decided to use the previous set of clusters to inform a pre-defined set of groupings that would provide coverage over the set of validators and their behavior. In this way, we ensure that the clusters themselves represent distinct and interesting behaviors with an implication as to their overall performance. Ultimately, we settled on the following 7 tiers:
These thresholds allow validators to be placed into 7 tiers where we sort according to the number of assignments and time on the network. Within each tier, scores are derived using the numeric variables provided, preferring validators that have a large number of assignments, have been on the network the longest, and have the fewest skipped slots. Our pre-defined thresholds spread out the distribution of scores as we intended! When we compare the old scoring procedure to the new, the improvement in the distribution is immediately obvious.
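As a minimal sketch of how such a rule-based tiering might be implemented (the thresholds follow the tier descriptions in the table below; the exact ordering of the checks is our own assumption):

```python
def assign_tier(assigned, executed, slashed):
    """Place a validator into one of the 7 tiers.

    A sketch following the published tier descriptions: slashed validators
    first, then perfect proposers split by experience, then success-rate bands.
    """
    if slashed:
        return 7                       # Slashed and Gone
    if assigned == 0:
        return 4                       # Completely Inexperienced
    rate = executed / assigned
    if rate == 1.0:
        return 1 if assigned >= 2 else 3   # Perfect Proposers / Perfect Inexperienced
    if rate >= 0.9:
        return 2                       # Validators with 90% success rate
    if rate >= 0.5:
        return 5                       # All others with >= .5 success rate
    return 6                           # Nodes with < .5 success rate
```

Within each tier, validators would then be sorted by assignments, time on the network, and skipped slots to produce the final scores.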
Figure 4.2: Sorted validator scores when applying new scoring procedure to Mainnet data (right) compared with the old scoring procedure (left).
Finally, we can directly compare the final score distribution of our new approach to the original Medalla distribution and see that we've captured many of the original characteristics, with more well-defined tiers and greater differentiation between them.
Once we sorted within the groupings we were able to finalize the tiers.
The most telling aspect of the new clustering is how well nearly ALL validators are performing. Tiers 6 and 7 would be considered non-productive contributors, having either been slashed or skipped more blocks than they have executed. Aside from these errant validators, nearly every other participant has a positive impact on the network. Over 93,000 validators have perfect proposal rates with at least 2 assignments.
From a macro perspective, these tiers differentiate the network validators' behaviors effectively and show that the overall health of the network is good.
| Tier | Description | Count | Time Active (hours) | Successful Blocks | Skipped Blocks | Percentage Slashed |
|---|---|---|---|---|---|---|
| 1 | Perfect Proposers with at least 2 assignments | 100402 | 2089.7764 | 8.9862353 | 0.0000000 | 0 |
| 2 | Validators with 90% success rate | 4247 | 2904.9413 | 14.5147163 | 1.0348481 | 0 |
| 3 | Perfect Inexperienced Nodes | 6341 | 567.0764 | 1.0000000 | 0.0000000 | 0 |
| 4 | Completely Inexperienced | 6200 | 141.6200 | 0.0000000 | 0.0000000 | 0 |
| 5 | All others with >= .5 Success rate | 3671 | 2164.9815 | 7.5069463 | 1.7711795 | 0 |
| 6 | Nodes with < .5 Success rate | 340 | 1448.0587 | 0.9823529 | 4.9058824 | 0 |
| 7 | Slashed and Gone | 134 | 460.4983 | 2.7761194 | 0.0820896 | 100 |
Once these tiers were derived, our next task was to update the live dashboard. Figure 4.3 shows the updated dashboard with the latest data. We maintained the simple, functional design from the original interface, with two tabs:
Figure 4.3: Left: The Statistics tab of our validator dashboard, highlighting the high level overview of the validator performance and tiers. Right: The Data tab of our validator dashboard, showing the performance and rank of each individual validator.
The following statistics are provided in the dashboard:
The tier labels and distributions are provided below the statistics, along with group averages across a number of the important statistics for each tier.
While we are pleased with the results so far, this work is far from complete. As the Ethereum blockchain continues to grow and more validators join the network, we would love assistance in continuing to develop robust statistics around validator performance and methods for characterizing validator behavior. If you are interested in getting involved, please reach out to us on Twitter.
This article was written as part of an Ethereum Foundation Staking Community Grant. Many thanks to both Lakshman Sankar and Jim McDonald for their technical assistance and support.